Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Improve arity_assign: ~2x improvement if we share data. #1076

Merged
merged 8 commits into from
Jun 16, 2022

Conversation

ritchie46
Copy link
Collaborator

This continues upon #1073 (you may close that one if you prefer in one PR, otherwise I rebase).

I did a local benchmark to see what the cost of the new arity_assign would be in the case we share data. In polars expressions that is every first operation:

Gnuplot not found, using plotters backend
bench_1                 time:   [608.34 us 609.30 us 610.73 us]                    
                        change: [+44.741% +45.228% +45.698%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

This showed, that in case of shared data, we first do a memcpy and this is ~2x slower than the kernels where we allocate a new memory buffer.

Luckily we can have best of both worlds. :)

This PR branches depending on the shared state:

  • in case we share state -> we alloc (and thus choose the fastest path possible)
  • in case we own -> we mutate in place (and thus choose the fastest path possible)

I also added a distinction between set_validity and with_validity as we now have mutable data, it is not always logical return a new PrimitiveArray.

@codecov
Copy link

codecov bot commented Jun 15, 2022

Codecov Report

Merging #1076 (97f3d39) into main (39db6fb) will decrease coverage by 0.05%.
The diff coverage is 62.76%.

@@            Coverage Diff             @@
##             main    #1076      +/-   ##
==========================================
- Coverage   81.01%   80.95%   -0.06%     
==========================================
  Files         366      366              
  Lines       35056    35134      +78     
==========================================
+ Hits        28399    28442      +43     
- Misses       6657     6692      +35     
Impacted Files Coverage Δ
src/array/primitive/mutable.rs 94.93% <45.45%> (-1.35%) ⬇️
src/compute/arity_assign.rs 72.05% <62.22%> (-20.54%) ⬇️
src/array/primitive/mod.rs 81.65% <66.66%> (-4.06%) ⬇️
src/array/boolean/mod.rs 86.18% <100.00%> (ø)
src/bitmap/utils/slice_iterator.rs 98.78% <0.00%> (+1.21%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 39db6fb...97f3d39. Read the comment docs.

Copy link
Owner

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! This is great!

src/array/primitive/mod.rs Outdated Show resolved Hide resolved
src/array/primitive/mod.rs Outdated Show resolved Hide resolved
src/array/primitive/mutable.rs Show resolved Hide resolved
ritchie46 and others added 2 commits June 15, 2022 22:14
@ritchie46
Copy link
Collaborator Author

@jorgecarleitao in light of #1078 and the possible into_iter().collect() over bits call,

I added the allocation branch for the validity as well.

@jorgecarleitao jorgecarleitao merged commit d4873fc into jorgecarleitao:main Jun 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants